Abstract

A software product is made up of many modules. This raises the idea of assigning
priorities to the modules so that the important ones are tested first. Such an approach
is desirable because time and cost constraints make it impossible to test every module
exhaustively. This paper discusses some parameters needed to prioritize the modules of
a software product and provides a measure of optimal testing time and cost based on the
non-homogeneous Poisson process.
1 Introduction

Whenever software is developed, the question of its reliability arises. We need
some means of assuring that the software works properly. That is, the software must be
tested to find any faults that might exist before the product is released. For this
purpose the product is tested carefully, but exhaustive testing is not always feasible,
as it can be very expensive in both cost and time. A modular approach is therefore
suggested, so that the testing authority can test the important modules of the software
first and save time and cost.

It is impractical to test the software until all bugs are removed, so the tester should
also be aware of the optimal time and cost required to test the modules. We also accept
a small number of faults within an agreed range instead of requiring the software to be
100% error free. For this reason, this paper attempts to provide optimal boundary values
for time and cost, taking into account the actual percentage of faults found in testing.
A project manager should know the point at which testing should stop and the software
should be either released or rejected.

A lot of work has been done in the area of optimal software testing. McDaid and Wilson
(2001) gave three plans for deciding how long to test software by introducing an
optimal time measure [2]. Musa and Ackerman used the concept of reliability to make
the decision [3]. Ehrlich, Prasanna, Stampfel and Wu estimated the cost of a stop-test
decision [4]. One of the most suitable models for determining optimal cost and time
was proposed by Goel and Okumoto [5]. They gave a model based on the non-homogeneous
Poisson process to determine the optimal cost and time for software [6] [7]. Praveen
et al. enhanced their work by proposing a cumulative priority based approach for
finding the optimal software testing period [8].

In this paper, we consider a modular approach to testing software. We suggest
assigning a weight to each module depending on various parameters. The hierarchy
of the modules also plays an important role in the decision, as a preceding module will
always affect its dependent modules. We enhance previous ideas by adding this
hierarchical module concept.

The next section briefly explains the background and related work. Section 3 provides
the module prioritization schema based on various factors and our approach to testing the
software in order to determine whether it is ready for release. Section 4 presents an
example in which this approach is applied. The last section concludes the paper.

2 Background and Related Work 

2.1 Non-homogeneous Poisson process

A Poisson process is one of the most significant random processes in probability theory. It
is widely used to model random points in time and space, such as the times of radioactive
emissions, the arrival times of customers at a service center and the positions of flaws in
a piece of material. Several important probability distributions arise naturally from the
Poisson process. A Poisson process is a collection of random variables N(t), where N(t) is
the number of events that have occurred up to time t (starting from time 0) [8]. The
number of events between time a and time b is N(b) - N(a) and has a Poisson
distribution. A non-homogeneous Poisson process is a Poisson process whose rate parameter
\lambda(t) is a function of time, e.g. the arrival rate of vehicles at a traffic signal.

2.2 Related work by Goel and Okumoto 

Faults present in the system cause software failures at random times. Let N(t) (where
t > 0) be the cumulative number of failures at time t (either CPU time or calendar time).
According to Goel and Okumoto [5], the expected number of faults detected by time t,
m(t), can be written as

m(t) = a (1 - e^{-bt})    (1)

where m(\infty) = a, so that a represents the expected number of software failures to be
eventually encountered and b is the detection rate for an individual fault.

According to Goel and Okumoto, the operational performance of a system depends to a large
extent on testing time. A longer testing phase leads to enhanced performance.
Also, the cost of fixing a fault during operation is generally much higher than during
testing. However, the time spent in testing delays the product release, which leads to
additional costs. The objective is to determine the optimal release time, that is, the
testing time that minimizes the total cost. Goel and Okumoto gave the parameters c_1, c_2,
c_3, t and T, which are as follows:

c_1 = cost of fixing a fault during testing

c_2 = cost of fixing a fault during operation (c_2 > c_1)

c_3 = cost of testing per unit time

t = software life cycle length

T = software release time (same as testing time)

Since m(t) represents the expected number of faults during (0, t), the expected costs of
fixing faults during the testing and operational phases are c_1 m(T) and c_2 [m(t) - m(T)]
respectively. Further, the testing cost during a time period T is c_3 T. If there is a cost
associated with delay in meeting a delivery plan, such a cost could be included in c_3.
Combining the above costs, the total expected cost is given by (2).

C(T) = c_1 m(T) + c_2 [m(t) - m(T)] + c_3 T    (2)

The policy that minimizes the average cost depends on how ab compares with

C_r = c_3 / (c_2 - c_1)    (3)

Two cases arise, ab > C_r and ab <= C_r.

Case I: If ab > C_r, the optimal policy is to take

T* = min(T_0, t)    (4)

where T_0 = (1/b) ln(ab / C_r).

Case II: If ab <= C_r, then T* = 0. If the cost of testing or the cost of delay in release
is very high, the solution favors no testing at all, i.e. T* = 0.

On the other hand, if the cost of fixing a fault after release is very high as compared
to the usefulness of the system, the solution will tend to favor not using the system, i.e.
T* = t.
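
The release policy above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names and the numerical values are hypothetical and are not taken from Goel and Okumoto or from this paper.

```python
import math


def mean_faults(time, a, b):
    """Expected number of faults detected by `time`, equation (1)."""
    return a * (1.0 - math.exp(-b * time))


def total_cost(T, t, a, b, c1, c2, c3):
    """Total expected cost of releasing at time T, equation (2)."""
    m_T = mean_faults(T, a, b)
    m_t = mean_faults(t, a, b)
    return c1 * m_T + c2 * (m_t - m_T) + c3 * T


def optimal_release_time(t, a, b, c1, c2, c3):
    """Optimal release time T* from equations (3) and (4)."""
    if c2 <= c1:
        raise ValueError("the policy assumes c2 > c1")
    c_r = c3 / (c2 - c1)
    if a * b <= c_r:
        return 0.0  # Case II: testing does not pay off
    t0 = math.log(a * b / c_r) / b
    return min(t0, t)  # Case I


# Illustrative values only (assumed, not from the paper).
params = dict(t=1000.0, a=100.0, b=0.05, c1=1.0, c2=5.0, c3=2.0)
T_star = optimal_release_time(**params)
C_star = total_cost(T_star, **params)
```

With these assumed values, ab = 5 exceeds C_r = 0.5, so Case I applies and T* equals T_0 = ln(10)/0.05.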

2.3 Related work by Praveen et al. 

Praveen et al. [8] suggest prioritizing the software modules into five categories, namely
very high, high, medium, low and very low. They then calculate the optimal cost and time
in a way similar to the work of Goel and Okumoto. To find the maximum allowable cost and
time, the concept of stringency is used. Stringency is the maximum allowable deviation
from the optimum and is decided by the organization.

They then advise starting to test the software in order to measure the actual time and
actual cost for each priority category. The deviations from the optimal testing time and
the optimal cost can be calculated from (5) and (6).

\alpha = (T_a - T*) / T*    (5)

where

\alpha = deviation from optimal time
T_a = actual testing time

T* = optimal testing time calculated from (4), and

\beta = (C_a - C_0) / C_0    (6)

where

\beta = deviation from optimal cost
C_a = actual testing cost

C_0 = optimal testing cost calculated from (2).

The limiting factor \delta is given by (7).

\delta = \alpha + \beta    (7)

Afterwards they calculate the limiting factor cumulatively to determine whether further
software testing is required.
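
As a small numerical illustration of the limiting factor, the sketch below assumes that both deviations are relative differences from the optimum, i.e. (actual - optimal) / optimal. That form, the function names and the sample values are assumptions made here for illustration.

```python
def relative_deviation(actual, optimal):
    """Deviation of an actual value from its optimum (assumed relative form)."""
    if optimal <= 0:
        raise ValueError("optimal value must be positive")
    return (actual - optimal) / optimal


def limiting_factor(t_actual, t_optimal, c_actual, c_optimal):
    """Limiting factor: sum of the time and cost deviations, as in (7)."""
    alpha = relative_deviation(t_actual, t_optimal)  # time deviation
    beta = relative_deviation(c_actual, c_optimal)   # cost deviation
    return alpha + beta


# Hypothetical measurements for one priority category.
delta = limiting_factor(t_actual=50.0, t_optimal=46.0,
                        c_actual=480.0, c_optimal=400.0)
```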

2.4 Related work by Ohba 

The models discussed above view the software as a single unit, regardless of the
structural or functional relationships among software subsystems (modules). Based on the
concept of redundancy, the recovery block technique [15] and the N-version programming
technique s-independently produce multiple versions of the software to perform the same
function.

Most software reliability models assume s-independence of faults. However, Ohba [16]
argues that faults are s-dependent because of the logical or functional dependency within
a program. Ohba observed an S-shaped software reliability growth curve, as opposed to
the exponential growth curve of the s-independence models. The model is characterized
by:

(8)

Unlike most software reliability models, which use execution time, the S-shaped model is
generally observed when calendar time is used.

2.5 Musa-Okumoto

Musa and Okumoto [17] proposed a logarithmic Poisson execution-time model in which the
observed number of failures by time t is an NHPP. This model adds a decay parameter and
is characterized by:

m(t) = (1/\theta) \log(\lambda \theta t + 1)    (9)

3 Proposed Approach

3.1 Components Priority

To ensure that the component prioritization is uniform and effective, it is imperative to
introduce a schema [13]. The following parameters may be helpful in deciding the priority
of the components.

Production Time. This is the amount of work carried out by employees on the
project. This parameter keeps track of the total person-hours spent on a module. Module
priority increases as production time increases.

Decision density. High complexity may result in poor understandability and more
errors. Complex procedures also need more time to develop and test. Therefore, excessive
complexity should be avoided. Overly complex procedures should be simplified by rewriting
them or splitting them into several procedures. Complexity is often positively correlated
with code size: a big program or function is likely to be complex as well. The two are not
equal, however. A procedure with relatively few lines of code might be far more complex
than a long one. We recommend the combined use of lines of code and complexity metrics to
detect complex code. The total cyclomatic complexity for a module is calculated as follows.

TCC = Sum(CC) - Count(CC) + 1    (10)

where Sum(CC) is the sum of the cyclomatic complexities of the procedures in the module
and Count(CC) is the number of those procedures.

Cyclomatic complexity is usually higher in longer procedures. How much decision logic is
there actually, compared to lines of code? This is what decision density (also
called cyclomatic density) measures.

DD = CC / LLOC    (11)

where LLOC is the number of logical lines of code. This parameter shows the average
decision density of the code lines within the modules.
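
Equations (10) and (11) can be illustrated with a short Python sketch. It assumes that Sum(CC) is taken over the per-procedure cyclomatic complexities of a module and that Count(CC) is the number of procedures; the function names and sample numbers are hypothetical.

```python
def total_cyclomatic_complexity(procedure_cc):
    """TCC = Sum(CC) - Count(CC) + 1 over the procedures of a module."""
    if not procedure_cc:
        raise ValueError("a module needs at least one procedure")
    return sum(procedure_cc) - len(procedure_cc) + 1


def decision_density(cc, lloc):
    """DD = CC / LLOC, where LLOC is the number of logical lines of code."""
    if lloc <= 0:
        raise ValueError("LLOC must be positive")
    return cc / lloc


# Hypothetical module with four procedures and 160 logical lines of code.
cc_values = [3, 7, 1, 12]
tcc = total_cyclomatic_complexity(cc_values)
dd = decision_density(tcc, 160)
```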

Programming Path. This parameter describes the coding environment that is used: the
costs associated with the technology required for the component, the importance of the
current technology for the component, and how many experts are available for that
technology.

Size of Components. How much code has been written for the component?

Skill of fault reporters/resolvers. How reliable is the source of a reported fault?
Errors may be reported on technical grounds or merely through a user's inexperience. In our
model, we assume that faults are collected using a bug tracking system that is also open
to customers.

Weight priority. This includes the ranking given by developers, managers and customers
based on the requirements and previous experience. It also includes risk factors.

Code reusability. If earlier source code can be used in the current work with
little or no modification, we call it code reusability. Reuse lessens the need to
test the code again, as it has already been tested earlier.

Coupling. This is a measure of the connectedness of one module to another. It is given as

C = 1 - k / (d_i + a c_i + d_o + b c_o + g_d + c g_c + w + r)    (12)

where C = coupling

d_i = number of input data parameters

c_i = number of input control parameters

d_o = number of output data parameters

c_o = number of output control parameters

g_d = number of global variables used as data

g_c = number of global variables used as control

w = number of modules called (fan-out)

r = number of modules calling the module under consideration (fan-in)

The values of k, a, b and c may be adjusted as more experimental verification occurs
[11].
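
A minimal sketch of the coupling measure follows. It assumes the form C = 1 - k / (d_i + a c_i + d_o + b c_o + g_d + c g_c + w + r); the defaults k = 1 and a = b = c = 2 are illustrative choices made here, and the function name and sample counts are hypothetical.

```python
def module_coupling(d_i, c_i, d_o, c_o, g_d, g_c, w, r,
                    k=1.0, a=2.0, b=2.0, c=2.0):
    """Coupling C of one module; k, a, b and c are tunable (assumed defaults)."""
    denominator = d_i + a * c_i + d_o + b * c_o + g_d + c * g_c + w + r
    if denominator <= 0:
        raise ValueError("module has no parameters, globals or calls")
    return 1.0 - k / denominator


# Hypothetical module: 3 data inputs, 1 control input, 2 data outputs,
# 1 global used as data, 2 modules called and 1 caller.
coupling = module_coupling(d_i=3, c_i=1, d_o=2, c_o=0, g_d=1, g_c=0, w=2, r=1)
```

Under these assumptions a larger interface pushes C toward 1, while a module with a single data parameter and no control flags yields C = 0.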

Layout appropriateness. For a specific layout (i.e., a specific GUI design), a cost can
be assigned to each sequence of actions according to the following relationship:

cost = \sum_k [frequency of transition(k) x cost of transition(k)]    (13)

where k is a specific transition from one layout entity to the next as a specific task is
accomplished. Layout appropriateness is defined as

LA = 100 x [(cost of LA-optimal layout) / (cost of proposed layout)]    (14)

where LA = 100 for an optimal layout.
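
Equations (13) and (14) can be illustrated as follows. The representation of transitions as (frequency, cost) pairs and all numbers are assumptions made for this sketch.

```python
def layout_cost(transitions):
    """Sum of frequency x cost over all transitions, equation (13)."""
    return sum(frequency * cost for frequency, cost in transitions)


def layout_appropriateness(optimal_cost, proposed_cost):
    """LA = 100 x optimal cost / proposed cost, equation (14)."""
    if proposed_cost <= 0:
        raise ValueError("proposed layout cost must be positive")
    return 100.0 * optimal_cost / proposed_cost


# Hypothetical (frequency, cost) pairs for the same three transitions.
proposed = [(10, 3.0), (4, 5.0), (1, 8.0)]
optimal = [(10, 2.0), (4, 4.0), (1, 8.0)]
la = layout_appropriateness(layout_cost(optimal), layout_cost(proposed))
```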

Maintenance. Let

M_T = the number of modules in the current release

F_c = the number of modules in the current release that have been changed

F_a = the number of modules in the current release that have been added

F_d = the number of modules from the preceding release that were deleted in the current
release

The software maturity index is computed in the following manner:

SMI = [M_T - (F_a + F_c + F_d)] / M_T    (15)

As SMI approaches 1.0, the product begins to stabilize. SMI may also be used as a
parameter for planning software maintenance activities.
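
The software maturity index of equation (15) can be computed as in the sketch below; the function name and release figures are hypothetical.

```python
def software_maturity_index(m_t, f_a, f_c, f_d):
    """SMI = [M_T - (F_a + F_c + F_d)] / M_T, equation (15)."""
    if m_t <= 0:
        raise ValueError("the current release must contain modules")
    return (m_t - (f_a + f_c + f_d)) / m_t


# Hypothetical release: 120 modules, 5 added, 10 changed, 3 deleted.
smi = software_maturity_index(m_t=120, f_a=5, f_c=10, f_d=3)
```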

The parameters are not limited to those above; other parameters may also be used,
and even fuzzy parameters may be included.

3.2 Weight Parameter for Each Component 

In our system these parameters are based on neural networks. Assume that w^1_{ij} (i =
1, 2, ..., p; j = 1, 2, ..., q) are the weights between the i-th unit on the sensory layer
and the j-th unit on the association layer, and that w^2_{jk} (j = 1, 2, ..., q; k = 1, 2,
..., r) are the weights between the j-th unit on the association layer and the k-th unit
on the response layer. Let x_i represent the normalized input variable to the i-th unit on
the sensory layer and y_k the output values. We apply normalized values of fault level,
fault reporter, etc. as the input values x_i. Consider the logistic activation function
(sigmoid function)

f(x) = 1 / (1 + e^{-x})    (16)

Then the input-output rules of each unit on each layer are

h_j = f( \sum_{i=1}^{p} w^1_{ij} x_i )    (17)

y_k = f( \sum_{j=1}^{q} w^2_{jk} h_j )    (18)

We apply multi-layered neural networks trained by back-propagation in order to learn the
interaction among software components [18]. The error in y_k may be given as

E = (1/2) \sum_{k=1}^{r} (y_k - d_k)^2    (19)

where d_k are the target values for the output values. We consider the estimation
and prediction model so that the property of interaction among software components
accumulates on the connection weights of the neural network. Finally, we obtain the total
weight parameter p_k, which represents the level of importance of each component:

p_k = y_k / \sum_{l=1}^{r} y_l    (20)
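
The forward pass and the normalized weight parameter can be sketched in plain Python. The sketch assumes a standard logistic activation, a squared-error measure with factor 1/2, and p_k obtained by dividing each output by the sum of all outputs; the weights and inputs are arbitrary illustrative numbers, and training by back-propagation is not shown.

```python
import math


def sigmoid(x):
    """Logistic activation function."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, w1, w2):
    """Propagate inputs x through the network; w1[i][j] and w2[j][k]."""
    q = len(w1[0])  # units on the association layer
    r = len(w2[0])  # units on the response layer
    h = [sigmoid(sum(w1[i][j] * x[i] for i in range(len(x)))) for j in range(q)]
    y = [sigmoid(sum(w2[j][k] * h[j] for j in range(q))) for k in range(r)]
    return h, y


def squared_error(y, d):
    """Half the sum of squared differences between outputs and targets."""
    return 0.5 * sum((y_k - d_k) ** 2 for y_k, d_k in zip(y, d))


def component_weights(y):
    """Normalize the outputs so that the weights p_k sum to one."""
    total = sum(y)
    return [y_k / total for y_k in y]


# Arbitrary illustrative inputs and weights (p = 3, q = 2, r = 3).
x = [0.2, 0.8, 0.5]
w1 = [[0.1, -0.3], [0.4, 0.2], [-0.5, 0.6]]
w2 = [[0.7, -0.2, 0.1], [0.3, 0.5, -0.4]]
h, y = forward(x, w1, w2)
p = component_weights(y)
```

The component with the largest p_k would be treated as the most important one under this reading.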
3.3 Our Extension to Goel and Okumoto Scheme

In the Goel-Okumoto method, m(t) represents the expected number of faults during (0, t),
and the expected costs of fixing faults during the testing and operational phases are
c_1 m(T) and c_2 [m(t) - m(T)] respectively. Further, the testing cost during a time period
T is c_3 T. If there is a cost associated with delay in meeting a delivery plan, such a
cost could be included in c_3.

Here we assume that software development takes place in a multi-version environment.
During the development phase of the current version, some faults appear in the previous
version. The cost of repairing such a fault is charged to the previous version, so we
cannot include it here. However, a fault appearing in the previous version is nearly
equivalent to a fault found in the current version. The cost for this cannot be the same as
c_1, so we denote this newly associated cost by c_4. Now, if n(t) represents the faults in
the previous version during (0, t), the expected cost of fixing them is c_4 n(T). Thus, the
total expected cost is now

C(T) = c_1 m(T) + c_2 [m(t) - m(T) - n(T)] + c_3 T + c_4 n(T)    (21)
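
The extended cost function can be explored numerically. In the sketch below the form of n(t) is an assumption: it is modeled as a second Goel-Okumoto curve with hypothetical parameters, since no specific form is given here. All values are illustrative, and the minimum is located by a simple grid search.

```python
import math


def go_mean(time, a, b):
    """Goel-Okumoto mean value function a(1 - exp(-b * time))."""
    return a * (1.0 - math.exp(-b * time))


def extended_cost(T, t, m, n, c1, c2, c3, c4):
    """Total expected cost with the previous-version term, equation (21)."""
    return c1 * m(T) + c2 * (m(t) - m(T) - n(T)) + c3 * T + c4 * n(T)


def m(time):
    """Faults in the current version (hypothetical parameters)."""
    return go_mean(time, 100.0, 0.05)


def n(time):
    """Faults in the previous version (assumed form and parameters)."""
    return go_mean(time, 20.0, 0.02)


costs = dict(c1=1.0, c2=5.0, c3=2.0, c4=1.5)
t_life = 1000.0
grid = [step * 0.5 for step in range(401)]  # candidate T from 0 to 200
best_T = min(grid, key=lambda T: extended_cost(T, t_life, m, n, **costs))
```

Setting n to a function that always returns zero reduces the expression to the original cost (2).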



3.4 Component Importance basis Testing 

Now we decide the level of priority on the basis of the parameter p_k. Ties may be
resolved by manual decision. A dependent module should be given much higher
preference if its parent module has not been tested. After prioritizing the modules, we
find the optimum cost and time parameters in a way very similar to Goel's model.

Let T and C be the total time and cost available to release the software. Our aim is to
test all the modules within T and C. If we are not able to do this, then at least
the components with very high priority must be tested. We set a fault tolerance for
the first round of testing of all the components of a particular category (e.g. Very High)
and find the actual time and cost of testing.

Once the optimal cost and time parameters C*, T* are determined, we can compute an
expected cost as the limiting factor \delta = f(T, T*, C, C*), i.e.

where p is the odds in favour of cost.